Decoupled iteration mapping: improving dependency-loop performance on SIMD processors

Related resources

Improving Database Performance on Simultaneous Multithreading Processors

Simultaneous multithreading (SMT) allows multiple threads to supply instructions to the instruction pipeline of a superscalar processor. Because threads share processor resources, an SMT system is inherently different from a multiprocessor system and, therefore, utilizing multiple threads on an SMT processor creates new challenges for database implementers. We investigate three thread-based tec...

Improving Search Engines Performance on Multithreading Processors

In this paper we present strategies and experiments that show how to take advantage of the multi-threading parallelism available in Chip Multithreading (CMP) processors in the context of efficient query processing for search engines. We show that scalable performance can be achieved by letting the search engine go synchronous so that batches of queries can be processed concurrently in a simple ...

Iteration Mapping: Loop Software Pipelining on an XIMD

The multiple instruction streams, low synchronization cost and synchronous nature of the XIMD (variable instruction stream, multiple data stream) architecture create an opportunity for a new architecture-compiler interface. As an extension to the VLIW (Very Long Instruction Word) architecture, the XIMD can exploit all VLIW scheduling techniques but these do not take full advantage of the unique...

Decoupled Value Prediction on Trace Processors

Value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction, and executes speculatively its data-dependent instructions based on the predicted outcome. In this paper, we address several implementation issues for value prediction which are important on wide-issue superscalar architectures, and present a value prediction scheme based on the trace ...

Improving Memory Performance for Indirect Accesses on SIMD Computers

SIMD machines operate more efficiently on a wider range of problems when they have the ability to access memory with both global and local addresses. Recent work has made possible the use of caches for global addresses. This paper examines techniques for employing caches to improve memory accesses with local addresses. Specifically, we examine the improvement from utilizing a cluster-based indir...

Journal

Journal title: IEICE Electronics Express

Year: 2013

ISSN: 1349-2543

DOI: 10.1587/elex.10.20130798